Assessing coupling dynamics from an ensemble of time series 



O 

(N 

< 



<S1 



> 

m 
in 
p 

O 
o 



X 



German Gomez-Herrero*, Wei Wu^, Kalle Rutanen*, Miguel C. Soriano§, Gordon Pipa^, and Raul Vicente^ 
* Department of Signal Processing, Tampere University of Technology, PO 553, 33101 Tampere, Finland 

' Max Planck Institute for Brain Research, 60528 Frankfurt am Main, Germany and 
^IFISC, Instituto de Fisica Interdisciplinar y Sistemas Complejos, E-07122, Palma de Mallorca, Spain 

(Dated: August 4, 2010) 

Finding interdependency relations between (possibly multivariate) time series provides valuable 
knowledge about the processes that generate the signals. Information theory sets a natural frame- 
work for non-parametric measures of several classes of statistical dependencies. However, a reliable 
estimation from information-theoretic functionals is hampered when the dependency to be assessed 
is brief or evolves in time. Here, we show that these limitations can be overcome when we have 
access to an ensemble of independent repetitions of the time series. In particular, we gear a data- 
efficient estimator of probability densities to make use of the full structure of trial-based measures. 
By doing so, we can obtain time-resolved estimates for a family of entropy combinations (including 
mutual information, transfer entropy, and their conditional counterparts) which are more accurate 
than the simple average of individual estimates over trials. We show with simulated and real data 
that the proposed approach allows to recover the time-resolved dynamics of the coupling between 
different subsystems. 

PACS numbers: Valid PACS appear here 



A technical problem that arises in various fields of sci- 
ence is to detect interdependencies between simultane- 
ously measured time series. Namely, the detection of 
interdependencies is often the first step for elucidating 
how the subsystems that underly the time series inter- 
act. For example, in neuroscience, ecology, or economet- 
rics this approach has lead to the discovery of new neural 
codes [1] , better models of population dynamics [2] , and 
methods to assess the influence of an economic variable 
[3], respectively. Tools to unveil an interdependency in- 
clude linear techniques, such as cross-correlation and co- 
herency analysis [4], non- linear synchrony measures [5], 
and the evaluation of statistical dependencies via mutual 
information (MI) [6]. Although useful for assesing the 
strength of the interaction, these indices do not allow to 
identify directionality (i.e. cause-response relationships). 
The latter are arguably more important, if one aims to 
understand the functioning of a system at the mechanis- 
tic level. In this paper we propose a method to reliably 
estimate the temporal course of directed interactions, as 
revealed by several information-theoretic functionals. 

While causality is a broad concept, Wiener formulated 
an operative definition leaning on the idea that the cause 
occurs before the effect and, therefore, knowledge of the 
cause helps forecasting the effect [7]. The widely used 
Granger causality is the mathematical formalization of 
Wiener's definition in terms of linear regressions [4]. Al- 
ternatively, when a model of the underlying dynamics 
and of the interaction is not available, a sound non- 
parametric approach can be stated in terms of informa- 
tion theory. A prototypical example is transfer entropy 
(TE), which quantifies, in terms of a Kullback-Leibler di- 
vergence, how much the present and past of one system 
condition (i.e., cause in a Wiener sense) the dynamics of 
another [8]. Nevertheless, the non-parametric or model- 
freeness nature of a measure does not usually come for 



free. A practical pitfall, common to most information- 
theoretic approaches, is that they require many obser- 
vations to be reliably estimated. This requisite directly 
confronts with situations in which the dependency to be 
analyzed evolves in time or is subjected to fast transients. 
When the non-stationarity is only due to a slow change 
of a parameter, over-embedding techniques can partially 
solve the problem by capturing the slow dynamics of the 
parameter as an additional variable [9] . It is also habitual 
to de-trend the time series or divide them into small win- 
dows within which the signals can be considered as ap- 
proximately stationary. However, the above-mentioned 
procedures become unpractical when the relevant inter- 
actions change in a fast time scale. This is the common 
situation in brain responses and other complex systems 
where an external stimuli elicit a rapid functional reor- 
ganization of information-processing pathways. 

Fortunately, in several disciplines the experiments 
leading to the multivariate time series can be systemati- 
cally repeated. Thus, a typical experimental paradigm 
might render an ensemble of presumably independent 
repetitions or trials per experimental condition. In this 
letter we show that this multi-trial nature can be ex- 
ploited to produce time-resolved estimates for a family of 
information-theoretic measures that we call entropy com- 
binations. This family includes well-known functionals 
such as MI, TE, and their conditional counterparts: par- 
tial mutual information (PMI) [10] and partial transfer 
entropy (PTE) [11, 12]. We use simulations and experi- 
mental data to demonstrate that the proposed ensemble 
estimators of entropy combinations are much more accu- 
rate than simple averaging of individual trial estimates. 

We consider three simultaneously measured time se- 
ries generated from stochastic processes X, Y, and Z 
which can be approximated as stationary Markov pro- 
cesses [13] of finite order. The state space of X can 
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then be reconstructed using the delay embedded vectors 
x(n) = (x(n), x(n — d x + 1)) for n = 1, . . . , N, where 
n is a time index and dc is the corresponding Markov 
order. Similarly we could construct y(n) and z(n) for 
processes F and respectively. Let V = (Vi,...,V m ) 
denote a random m-dimensional vector. Then, an en- 
tropy combination is defined by: 

c(v Cl , v Cp ) = ]T - £f(v) (i) 

i=l 

where Vi G : A C [l,m] and G {—1,1} such 

that X^i=i s iX£i — X[i,m] where xs is the characteristic 
function of a set 5. It can be easily checked that MI, TE, 
PMI and PTE are all entropy combinations: 

Ix^Y = —HxY + Hx + Hy 

Tx^-y = —Hwxy + Hwx + Hxy — Hx 
Ix^y\z = —Hxzy + Hxz + Hzy — Hz 
Tx^y\z = —Hwxzy + Hwxz + Hxzy — Hxz 

where W = X + = x(n+l) so that Hwx is the differential 
entropy of p(x(n + 1), x(n)). The latter denotes the joint 
probability of finding X at states x(n + 1) , x(n) x (n — 
d x + 1) during time instants n + 1, n, n — 1, n — d x + 
1. Notice that, due to stationarity, p(x(n + l),x(n)) is 
invariant under variations of the time index n. 

A straightforward approach to the estimation of en- 
tropy combinations would be to add separate estimates 
of each of the involved multi-dimensional entropies. Pop- 
ular estimators of differential entropy include plug-in 
estimators and fixed and adaptive histogram or parti- 
tion methods. However, other non-parametric techniques 
such as kernel and nearest-neighbor estimators have been 
shown to be extremely more data-efficient and accu- 
rate [14]. An asymptotically unbiased estimator based 
on nearest-neighbor statistics is by Kozachenko and Leo- 
nenko (KL) [15]. For N realizations x[l], x[2], x[N] 
of a ci-dimensional random vector X, the KL estimator 
takes the form: 

d N 

H X = -m + V(A0 + logM + - lo S( e M) ( 2 ) 

i=l 

where ip is the digamma function, is the volume of the 
rf-dimensional unit ball, and e{i) is the distance from x[i] 
to its kth nearest neighbor in the set {x[j]} v - , v The KL 
estimator is based on the assumption that the density of 
the distribution of random vectors is constant within an 
e-ball. The bias of the final entropy estimate depends on 
the validity of this assumption, and thus, on the values of 
e(n). Since the size of the e-balls depends directly on the 
dimensionality of the random vector, the biases of esti- 
mates for the differential entropies in (1) will, in general, 
not cancel, leading to a poor estimator of the entropy 



combination. This problem can be partially overcome by 
noticing that (2) holds for any value of k so that we do not 
need to have a fixed k. Therefore, we can vary the value 
of k in each data point so that the radius of the corre- 
sponding e-balls would be approximately the same for the 
joint and the marginal spaces. This idea was originally 
proposed in [16] for estimating mutual information, was 
used in [17] to estimate PMI, and we generalize it here 
to the following estimator of entropy combinations: 

C(V Cl , V Cp ) = F(k) -j2 s i( F (ki(n))) n (3) 

i=l 

where F(k) = V(fc) - and (•••)„ = £ £^ =1 (- • • ) 

denotes averaging with respect to the time index. The 
term fc,(n) accounts for the number of neighbors of the 
nth realization of the marginal vector V/v located at a 
distance strictly less than e(n), where e(n) denotes the 
radius of the e-ball in the joint space. The point itself is 
included in this counting. 

A fundamental limitation of estimator (3) is the as- 
sumption that the involved multidimensional distribu- 
tions are stationary. However, this is hardly the case 
in many real applications and time-adaptation becomes 
crucial in order to obtain meaningful estimates. A trivial 
solution is to use the following time- varying estimator of 
entropy combinations: 

C({V Cl , V Cp }, n) = F(k) - (h(n)) (4) 

i=l 

This naive time-adaptive estimator is not useful in 
practice due to its large variance, which stems from the 
fact that a single data point is used for producing the es- 
timate at each time instant. However, let us consider the 
case of an ensemble of r' repeated measurements (trials) 
from the dynamics of V. Let us also denote by { v( r ) [n] } r 
the measured dynamics for those trials (r = 1,2,.../). 

Similarly, we denote by {v^ [n] } r the measured dynam- 
ics for the marginal vector V/v- A straightforward ap- 
proach for integrating the information from different tri- 
als is to average together estimates obtained from indi- 
vidual trials: 

C^({V Cl , V Cp }, «) = ^E d(r) ^ < ^p>- n ) 

r— 1 

where C^{{Vd , Vc p }, n) is the estimate obtained 
from the rth trial. However, this approach makes a poor 
use of the available data and will typically produce use- 
less estimates, as will be shown in the experimental sec- 
tion of this chapter. A more effective procedure takes 
into account the multi-trial nature of our data by search- 
ing for neighbors across ensemble members, rather than 
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FIG. 1: Nearest neighbor statistics across trials, a) For each 
time instant n = n* and trial r = r* , we compute the (maxi- 
mum norm) distance e' r '(n*) from v' r '[n*] to its fc-th near- 
est neighbor among all trials. Here the procedure is illustrated 
for k = 5. b) k[ r ' [n*] counts how many neighbors of v' r ' [n*] 

are within a radius e r (n*). The point itself (i.e. v^ r [n*]) 
is also included in this count. These neighbor counts are ob- 
tained for all i = 1, ...p marginal trajectories. 



from within each individual trial. This nearest ensemble 
neighbors [18] approach is illustrated in Fig. 1 and leads 
to the following ensemble estimator of entropy combina- 
tions: 



C™({V Cl ,...,V Cp },n) = F(k) -~£$>F ( k * r){n) 



where the counts of marginal neighbors 

{fc| r ^(w)}vi=i'"p r are computed using overlapping 
time- windows of size 2a, as shown in Fig. 1. For rapidly 
changing connectivity patterns, small values of a might 
be needed to track the coupling dynamics while larger 
values of a will lead to lower estimator variance. 

To demonstrate that en can be used to character- 
ize dynamic coupling patterns we simulated three non- 
linearly coupled autoregressive processes with a time- 
varying coupling factor: 

x r [n] = 0Ax r [n -l] + r) x , 

y r [n] = 0.5y r [n - 1 \ + K yx [n] sin(a; r [n - T yx \) + r) y , 
z r [n] = 0.5z r [n - 1 \ + n zy [n] sin (y r [n - T zy \) + r] z . 

during 1500 time steps and repeated R=50 trials with 
new initial conditions. The terms r\ Xl r\ y and r\ z represent 
normally distributed noise processes, which are mutually 
independent across trials and time instants. The coupling 
delays amount to r yx = 10, r zy = 15 while the dynamics 
of the coupling follows a sinusoidal variation: 
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FIG. 2: Partial transfer entropy between three non-linearly 
coupled Gaussian processes. The solid lines represent PTE 
values while the color-matched dashed lines denote corre- 
sponding p = 0.05 significance levels. The temporal vari- 
ance of the PTE estimates was reduced with a post-processing 
moving average filter of order 20. 



Before PTE estimation each time-series was time- 
delayed so that they had maximal mutual information 
with the destination of the flow. That is, before comput- 
ing some r a< _ b | c (n), the time-series b and c were delayed 
so that they shared maximum information with the time- 
series a. To assess the statistical significance of the PTE 
values (at each time-instant) we applied a permutation 
test with surrogate data generated by randomly shuffling 
trials [19]. Fig. 2 shows the time- varying PTEs obtained 
for these data with the ensemble estimator of entropy 
combinations given in Eq. 5. Indeed, the PTE analysis 
accurately describes the underlying interaction dynam- 
ics. In particular, it captures both the onset/offset and 
the oscillatory profile of the effective coupling across the 
three processes. On the other hand, the naive average 
estimator (5) did not reveal any significant flow of infor- 
mation between the three time-series (see supplementary 
material) . 

To evaluate the robustness and performance of the en- 
tropy combination estimator to real levels of noise and 
measurements variability, we also present a second exam- 
ple derived from experimental data on electronic circuits. 
The system consists of two nonlinear Mackey- Glass cir- 
cuits unidirectionally coupled through their voltage vari- 
ables. The master circuit is additionally subject to a 
feedback loop responsible for generating high dimensional 
chaotic dynamics. A time-varying effective coupling is 
then induced by periodically modulating the strength of 
the coupling between circuits as controlled by an external 
CPU. In this case, we applied transfer entropy between 
the voltage signals generated from the two circuits for 180 
trials, each 1000 sampling times long. Figure 3 shows the 
TE estimates obtained with (5) versus the temporal lag 
introduced between the two voltage signals (intended to 
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FIG. 3: Transfer entropy from the first electronic circuit to- 
wards the second. The upper figure shows time-varying TE 
versus the lag introduced in the temporal activation of the 
first circuit. Clearly, there is a directional flow of informa- 
tion time-locked at lag r = 20 samples, which is significant 
for all time-instants (p < 0.01). On the other hand, the flow 
of information in the opposite direction was much smaller 
(Ti<-2 < 0.0795 nats V(t,r)) and only reached significance 
(p < 0.01) for about 1% of the tuplas (n, r). The lower fig- 
ure shows that the temporal pattern of information flow for 
t = 20, i.e. T2i-i(n,T = 20), which resembles a sinusoid with 
a period of roughly 100 data samples. 



scan the unknown coupling delay). The results show that 
the TE estimates capture perfectly the dynamics of the 
effect exerted by the master circuit on the slave circuit. 
On the other hand, no significant coupling is detected in 
the reverse direction (see supplementary material). Both 
the period of the coupling dynamics (100 samples) and 



the coupling delay (20 samples) can be accurately recov- 
ered from Fig. 3. 

In conclusion, we have introduced an ensemble esti- 
mator of entropy combinations that is able to detect 
time-varying information flow between dynamical sys- 
tems, provided that an ensemble of repeated measure- 
ments is available for each system. The proposed ap- 
proach allows to construct time-adaptive estimators of 
MI, PMI, TE and PTE, which are the most common 
information-theoretic measures for dynamical coupling 
analyses. Using simulations and real physical measure- 
ments from electronic circuits we showed that these new 
estimators can accurately describe multivariate coupling 
dynamics. It is important to mention that intrinsic to 
our approach is the assumption that the evolution of 
the interdependencies to be detected are to some de- 
gree "locked" to the trial onset. This is typically the 
case when some controlled external perturbation induce 
or evokes the interactions across the subsystems mea- 
sured. The degree of locking determines the maximum 
temporal resolution achievable by the method (which is 
controlled via a). Nevertheless, alignment techniques can 
help to reduce the possible jitter across trials and thus 
increase the resolution. The methods presented here are 
general but we anticipate that a potential application is 
the analysis of the mechanisms underlying the generation 
of event-related brain responses and the seasonal varia- 
tions of geophysical variables. To promote dissemination 
we have publicly released a software library that includes 
efficient implementations of these and other information- 
theoretic methods [20]. 
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